Elements of the association
List of bibliographic references
Number of relevant bibliographic references: 78.

Ident. | Authors (with country if any) | Title |
---|---|---|
000453 | Manel Tagorti [France] ; Bruno Scherrer [France] | On the Rate of Convergence and Error Bounds for LSTD(λ) |
000454 | Boris Lesner [France] ; Bruno Scherrer [France] | Non-Stationary Approximate Modified Policy Iteration |
000722 | Bruno Scherrer [France] ; Mohammad Ghavamzadeh [France] ; Victor Gabillon [France] ; Boris Lesner [France] ; Matthieu Geist [France] | Approximate Modified Policy Iteration and its Application to the Game of Tetris |
000936 | Bruno Scherrer [France] ; Matthieu Geist [France] | Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search |
000A93 | Bruno Scherrer [France] | Approximate Policy Iteration Schemes: A Comparison |
000B67 | Bruno Scherrer [France] ; Matthieu Geist [France] | Quand l'optimalité locale implique une garantie globale : recherche locale de politique dans un espace convexe et algorithme d'itération sur les politiques conservatif vu comme une montée de gradient fonctionnel |
000B92 | Manel Tagorti [France] ; Bruno Scherrer [France] | Vitesse de convergence et borne d'erreur pour l'algorithme LSTD($\lambda$) |
000B93 | Bruno Scherrer [France] | Une étude comparative de quelques schémas d'approximation de type iterations sur les politiques |
000B95 | Manel Tagorti [France] ; Bruno Scherrer [France] | Rate of Convergence and Error Bounds for LSTD($\lambda$) |
000D16 | Matthieu Geist [France] ; Bruno Scherrer [France] | Off-policy Learning with Eligibility Traces: A Survey |
000D49 | Eugene A. Feinberg [United States] ; Jefferson Huang [United States] ; Bruno Scherrer [France] | Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming |
000F08 | Bruno Scherrer [France] | Improved and Generalized Upper Bounds on the Complexity of Policy Iteration |
000F09 | Victor Gabillon [France] ; Mohammad Ghavamzadeh [France] ; Bruno Scherrer [France] | Approximate Dynamic Programming Finally Performs Well in the Game of Tetris |
000F29 | Alain Dutech [France] ; Bruno Scherrer [France] ; Christophe Thiery [France] | La carotte et le bâton... et Tetris |
001120 | Bruno Scherrer [France] ; Boris Lesner [France] | Sur l'utilisation de politiques non-stationnaires pour les processus de décision Markoviens à horizon infini |
001122 | Bruno Scherrer [France] | Quelques majorants de la complexité d'itérations sur les politiques |
001130 | Manel Tagorti [France] ; Bruno Scherrer [France] ; Olivier Buffet [France] ; Joerg Hoffmann [France] | Abstraction Pathologies In Markov Decision Processes |
001172 | Manel Tagorti [France] ; Bruno Scherrer [France] ; Olivier Buffet [France] ; Joerg Hoffmann [France] | Abstraction Pathologies In Markov Decision Processes |
001183 | Bruno Scherrer [France] ; Matthieu Geist [France] | Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee |
001194 | Bruno Scherrer [France] | On the Performance Bounds of some Policy Search Dynamic Programming Algorithms |
001244 | Boris Lesner [France] ; Bruno Scherrer [France] | Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies |
001334 | Bruno Scherrer [France] | Performance Bounds for Lambda Policy Iteration and Application to the Game of Tetris |
001750 | Matthieu Geist [France] ; Bruno Scherrer [France] | Off-policy Learning with Eligibility Traces: A Survey |
001825 | Bruno Scherrer [France] ; Boris Lesner [France] | On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes |
001A58 | Matthieu Geist [France] ; Bruno Scherrer [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France] | A Dantzig Selector Approach to Temporal Difference Learning |
001A68 | Bruno Scherrer [France] ; Mohammad Ghavamzadeh [France] ; Victor Gabillon [France] ; Matthieu Geist [France] | Approximate Modified Policy Iteration |
001B20 | Matthieu Geist [France] ; Bruno Scherrer [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France] | Un sélecteur de Dantzig pour l'apprentissage par différences temporelles |
001B24 | Bruno Scherrer [France] ; Victor Gabillon [France] ; Mohammad Ghavamzadeh [France] ; Matthieu Geist [France] | Approximations de l'Algorithme Itérations sur les Politiques Modifié |
001B39 | Bruno Scherrer [France] ; Victor Gabillon [France] ; Mohammad Ghavamzadeh [France] ; Matthieu Geist [France] | Approximate Modified Policy Iteration |
001C03 | Bruno Scherrer [France] | On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes |
002138 | Matthieu Geist [France] ; Bruno Scherrer [France] | l1-penalized projected Bellman residual |
002139 | Bruno Scherrer [France] ; Matthieu Geist [France] | Recursive Least-Squares Learning with Eligibility Traces |
002267 | Victor Gabillon [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France] ; Bruno Scherrer [France] | Classification-based Policy Iteration with a Critic |
002279 | Bruno Scherrer [France] ; Matthieu Geist [France] | Moindres carrés récursifs pour l'évaluation off-policy d'une politique avec traces d'éligibilité |
002378 | Victor Gabillon [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France] ; Bruno Scherrer [France] | Classification-based Policy Iteration with a Critic |
002841 | Bruno Scherrer [France] | Performance Bounds for Lambda Policy Iteration and Application to the Game of Tetris |
002C27 | Bruno Scherrer [France] | Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view |
002C29 | Christophe Thiery [France] ; Bruno Scherrer [France] | Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems |
002C72 | Christophe Thiery [France] ; Bruno Scherrer [France] | Least-Squares λ Policy Iteration : optimisme et compromis biais-variance pour le contrôle optimal |
003231 | Bruno Scherrer [France] ; Christophe Thiery [France] | Performance bound for Approximate Optimistic Policy Iteration |
003232 | Alain Dutech [France] ; Bruno Scherrer [France] | Partially Observable Markov Decision Processes |
003565 | Christophe Thiery [France] ; Bruno Scherrer [France] | Une approche modifiée de Lambda-Policy Iteration |
003C68 | Christophe Thiery [France] ; Bruno Scherrer [France] | Improvements on Learning Tetris with Cross Entropy |
003C92 | Christophe Thiery [France] ; Bruno Scherrer [France] | Building Controllers for Tetris |
003D48 | Bruno Scherrer [France] ; Shie Mannor [Canada] | Error Reducing Sampling in Reinforcement Learning |
003D49 | Cesar Torres-Huitzil [Mexico] ; Bernard Girau [France] ; Amine Boumaza [France] ; Bruno Scherrer [France] | Embedded harmonic control for trajectory planning in large environments |
003D50 | Marek Petrik [United States] ; Bruno Scherrer [France] | Biasing Approximate Dynamic Programming with a Lower Discount Factor |
004139 | Alain Dutech [France] ; Bruno Scherrer [France] ; Christophe Thiery [France] | La carotte et le bâton... et Tetris |
004474 | Alain Dutech [France] ; Bruno Scherrer [France] | Processus décisionnels de Markov partiellement observables |
004599 | Bernard Girau [France] ; Amine Boumaza [France] ; Bruno Scherrer [France] ; Cesar Torres-Huitzil [Mexico] | Block-synchronous harmonic control for scalable trajectory planning |
004648 | Amine Boumaza [France] ; Bruno Scherrer [France] | Convergence and rate of convergence of simple ant models |
004725 | Amine Boumaza [France] ; Bruno Scherrer [France] | Convergence and Rate of Convergence of a Foraging Ant Model |
004952 | Amine Boumaza [France] ; Bruno Scherrer [France] | Convergence and rate of convergence of a simple ant model |
004958 | Amine Boumaza [France] ; Bruno Scherrer [France] | Optimal control subsumes harmonic control |
004E85 | Amine Boumaza [France] ; Bruno Scherrer [France] | Convergence and rate of convergence of a simple ant model |
004F65 | Bruno Scherrer [France] | Une condition suffisante pour l'implémentation connexionniste asynchrone |
005694 | Amine Boumaza [France] ; Bruno Scherrer [France] | Convergence et taux de convergence d'un algorithme fourmi simple |
005706 | Amine Boumaza [France] ; Bruno Scherrer [France] | Optimal control subsumes harmonic control |
005989 | Amine Boumaza [France] ; Bruno Scherrer [France] | Navigation, fonctions harmoniques et contrôle optimal stochastique |
005C50 | Bruno Scherrer [France] | Asynchronous Neurocomputing for optimal control and reinforcement learning with large state spaces |
006E36 | Bruno Scherrer [France] | Approche connexionniste du contrôle optimal |
007022 | Bruno Scherrer [France] ; Shie Mannor [United States] | Error reducing sampling in reinforcement learning |
007196 | Bruno Scherrer [France] | Modular self-organization for a long-living autonomous agent |
007272 | Bruno Scherrer [France] | Parallel asynchronous distributed computations of optimal control in large state space Markov Decision Processes |
007292 | Bruno Scherrer [France] | Apprentissage de représentation et auto-organisation modulaire pour un agent autonome |
007D73 | Iadine Chadès [France] ; Bruno Scherrer [France] ; François Charpillet [France] | Planning Cooperative Homogeneous Multiagent System Using Markov Decision Processes |
007D84 | Bruno Scherrer [France] | Modular self-organization |
007D85 | Bruno Scherrer [France] | Modular self-organization for a long-living autonomous agent |
008084 | Iadine Chadès [France] ; Bruno Scherrer [France] ; François Charpillet [France] | A Heuristic Approach for Solving Decentralized-POMDP : Assessment on the Pursuit Problem |
008918 | Bruno Scherrer [France] ; François Charpillet [France] | Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems |
008932 | Bruno Scherrer [France] | A connectionist architecture that adapts its representation to complex tasks |
008B58 | Bruno Scherrer [France] ; François Charpillet [France] | Cooperative Co-learning: A Model-based Approach for Solving Multi Agent Reinforcement Problems |
008B66 | Bruno Scherrer [France] ; François Charpillet [France] | Coevolutive Planning In Markov Decision Processes |
008C01 | Bruno Scherrer [France] | A connectionist architecture that adapts its representation to complex tasks |
008C50 | Bruno Scherrer [France] | Auto-organisation modulaire d'une architecture intelligente |
008C52 | Alain Dutech [France] ; Bruno Scherrer [France] | Learning to use contextual information for solving POMDP |
009713 | Iadine Chadès [France] ; Bruno Scherrer [France] ; François Charpillet [France] | A Heuristic Approach for Solving Decentralized-POMDP: Assessment on the Pursuit Problem |
00A054 | Bruno Scherrer [France] ; Frédéric Alexandre [France] ; François Charpillet [France] ; Stéphane Vialle | Modélisation stochastique d'une population de neurones, méta-apprentissage dans un problème de classification |
This area was generated with Dilib version V0.6.33.